Search | Virtual Health Library (Biblioteca Virtual em Saúde)

1.

Attention capture by own name decreases with speech compression.

Li, Simon Y W; Lee, Alan L F; Chiu, Jenny W S; Loeb, Robert G; Sanderson, Penelope M.

Cogn Res Princ Implic ; 9(1): 29, 2024 May 12.

Article in English | MEDLINE | ID: mdl-38735013

ABSTRACT

Auditory stimuli that are relevant to a listener have the potential to capture focal attention even when unattended, the listener's own name being a particularly effective stimulus. We report two experiments to test the attention-capturing potential of the listener's own name in normal speech and time-compressed speech. In Experiment 1, 39 participants were tested with a visual word categorization task with uncompressed spoken names as background auditory distractors. Participants' word categorization performance was slower when hearing their own name rather than other names, and in a final test, they were faster at detecting their own name than other names. Experiment 2 used the same task paradigm, but the auditory distractors were time-compressed names. Three compression levels were tested with 25 participants in each condition. Participants' word categorization performance was again slower when hearing their own name than when hearing other names; the slowing was strongest with slight compression and weakest with intense compression. Personally relevant time-compressed speech has the potential to capture attention, but the degree of capture depends on the level of compression. Attention capture by time-compressed speech has practical significance and provides partial evidence for the duplex-mechanism account of auditory distraction.

Subjects

Attention , Names , Speech Perception , Humans , Attention/physiology , Female , Male , Speech Perception/physiology , Adult , Young Adult , Speech/physiology , Reaction Time/physiology , Acoustic Stimulation

2.

Eye movements track prioritized auditory features in selective attention to natural speech.

Gehmacher, Quirin; Schubert, Juliane; Schmidt, Fabian; Hartmann, Thomas; Reisinger, Patrick; Rösch, Sebastian; Schwarz, Konrad; Popov, Tzvetan; Chait, Maria; Weisz, Nathan.

Nat Commun ; 15(1): 3692, 2024 May 01.

Article in English | MEDLINE | ID: mdl-38693186

ABSTRACT

Over the last decades, cognitive neuroscience has identified a distributed set of brain regions that are critical for attention. Strong anatomical overlap with brain regions critical for oculomotor processes suggests a joint network for attention and eye movements. However, the role of this shared network in complex, naturalistic environments remains understudied. Here, we investigated eye movements in relation to (un)attended sentences of natural speech. Combining simultaneously recorded eye tracking and magnetoencephalographic data with temporal response functions, we show that gaze tracks attended speech, a phenomenon we termed ocular speech tracking. Ocular speech tracking even differentiates a target from a distractor in a multi-speaker context and is further related to intelligibility. Moreover, we provide evidence for its contribution to neural differences in speech processing, emphasizing the necessity to consider oculomotor activity in future research and in the interpretation of neural differences in auditory cognition.

Subjects

Attention , Eye Movements , Magnetoencephalography , Speech Perception , Speech , Humans , Attention/physiology , Eye Movements/physiology , Male , Female , Adult , Young Adult , Speech Perception/physiology , Speech/physiology , Acoustic Stimulation , Brain/physiology , Eye-Tracking Technology

3.

Speech, voice, and language outcomes following deep brain stimulation: A systematic review.

Tabari, Fatemeh; Berger, Joel I; Flouty, Oliver; Copeland, Brian; Greenlee, Jeremy D; Johari, Karim.

PLoS One ; 19(5): e0302739, 2024.

Article in English | MEDLINE | ID: mdl-38728329

ABSTRACT

BACKGROUND: Deep brain stimulation (DBS) reliably ameliorates cardinal motor symptoms in Parkinson's disease (PD) and essential tremor (ET). However, the effects of DBS on speech, voice and language have been inconsistent and have not been examined comprehensively in a single study. OBJECTIVE: We conducted a systematic analysis of literature by reviewing studies that examined the effects of DBS on speech, voice and language in PD and ET. METHODS: A total of 675 publications were retrieved from PubMed, Embase, CINAHL, Web of Science, Cochrane Library and Scopus databases. Based on our selection criteria, 90 papers were included in our analysis. The selected publications were categorized into four subcategories: Fluency, Word production, Articulation and phonology and Voice quality. RESULTS: The results suggested a long-term decline in verbal fluency, with more studies reporting deficits in phonemic fluency than semantic fluency following DBS. Additionally, high frequency stimulation, left-sided and bilateral DBS were associated with worse verbal fluency outcomes. Naming improved in the short-term following DBS-ON compared to DBS-OFF, with no long-term differences between the two conditions. Bilateral and low-frequency DBS demonstrated a relative improvement for phonation and articulation. Nonetheless, long-term DBS exacerbated phonation and articulation deficits. The effect of DBS on voice was highly variable, with both improvements and deterioration in different measures of voice. CONCLUSION: This was the first study that aimed to combine the outcome of speech, voice, and language following DBS in a single systematic review. The findings revealed a heterogeneous pattern of results for speech, voice, and language across DBS studies, and provided directions for future studies.

Subjects

Deep Brain Stimulation , Language , Parkinson Disease , Speech , Voice , Deep Brain Stimulation/methods , Humans , Parkinson Disease/therapy , Parkinson Disease/physiopathology , Speech/physiology , Voice/physiology , Essential Tremor/therapy , Essential Tremor/physiopathology

4.

Asynchronous behavioral and neurophysiological changes in word production in the adult lifespan.

Krethlow, Giulia; Fargier, Raphaël; Atanasova, Tanja; Ménétré, Eric; Laganaro, Marina.

Cereb Cortex ; 34(5), 2024 May 02.

Article in English | MEDLINE | ID: mdl-38715409

ABSTRACT

Behavioral and brain-related changes in word production have been claimed to predominantly occur after 70 years of age. Most studies investigating age-related changes in adulthood only compared young to older adults, failing to determine whether neural processes underlying word production change at an earlier age than observed in behavior. This study aims to fill this gap by investigating whether changes in neurophysiological processes underlying word production are aligned with behavioral changes. Behavior and the electrophysiological event-related potential patterns of word production were assessed during a picture naming task in 95 participants across five adult lifespan age groups (ranging from 16 to 80 years old). While behavioral performance decreased starting from 70 years of age, significant neurophysiological changes were present at the age of 40 years old, in a time window (between 150 and 220 ms) likely associated with lexical-semantic processes underlying referential word production. These results show that neurophysiological modifications precede the behavioral changes in language production; they can be interpreted in line with the suggestion that the lexical-semantic reorganization in mid-adulthood influences the maintenance of language skills longer than for other cognitive functions.

Subjects

Aging , Electroencephalography , Evoked Potentials , Humans , Adult , Aged , Male , Middle Aged , Female , Young Adult , Adolescent , Aged, 80 and over , Aging/physiology , Evoked Potentials/physiology , Brain/physiology , Speech/physiology , Semantics

5.

Animal cognition: Dogs build semantic expectations between spoken words and objects.

Murray, Micah M; Middelmann, Naomi K; Federmeier, Kara D.

Curr Biol ; 34(9): R348-R351, 2024 May 06.

Article in English | MEDLINE | ID: mdl-38714162

ABSTRACT

A recent study has used scalp-recorded electroencephalography to obtain evidence of semantic processing of human speech and objects by domesticated dogs. The results suggest that dogs do comprehend the meaning of familiar spoken words, in that a word can evoke the mental representation of the object to which it refers.

Subjects

Cognition , Semantics , Animals , Dogs/psychology , Cognition/physiology , Humans , Electroencephalography , Speech/physiology , Speech Perception/physiology , Comprehension/physiology

6.

The impact of face coverings on audio-visual contributions to communication with conversational speech.

Jackson, I R; Perugia, E; Stone, M A; Saunders, G H.

Cogn Res Princ Implic ; 9(1): 25, 2024 Apr 23.

Article in English | MEDLINE | ID: mdl-38652383

ABSTRACT

The use of face coverings can make communication more difficult by removing access to visual cues as well as affecting the physical transmission of speech sounds. This study aimed to assess the independent and combined contributions of visual and auditory cues to impaired communication when using face coverings. In an online task, 150 participants rated videos of natural conversation along three dimensions: (1) how much they could follow, (2) how much effort was required, and (3) the clarity of the speech. Visual and audio variables were independently manipulated in each video, so that the same video could be presented with or without a superimposed surgical-style mask, accompanied by one of four audio conditions (either unfiltered audio, or audio-filtered to simulate the attenuation associated with a surgical mask, an FFP3 mask, or a visor). Hypotheses and analyses were pre-registered. Both the audio and visual variables had a statistically significant negative impact across all three dimensions. Whether or not talkers' faces were visible made the largest contribution to participants' ratings. The study identifies a degree of attenuation whose negative effects can be overcome by the restoration of visual cues. The significant effects observed in this nominally low-demand task (speech in quiet) highlight the importance of the visual and audio cues in everyday life and that their consideration should be included in future face mask designs.

Subjects

Cues , Speech Perception , Humans , Adult , Female , Male , Young Adult , Speech Perception/physiology , Visual Perception/physiology , Masks , Adolescent , Speech/physiology , Communication , Middle Aged , Facial Recognition/physiology

7.

Perceptual formant discrimination during speech movement planning.

Wang, Hantao; Ali, Yusuf; Max, Ludo.

PLoS One ; 19(4): e0301514, 2024.

Article in English | MEDLINE | ID: mdl-38564597

ABSTRACT

Evoked potential studies have shown that speech planning modulates auditory cortical responses. The phenomenon's functional relevance is unknown. We tested whether, during this time window of cortical auditory modulation, there is an effect on speakers' perceptual sensitivity for vowel formant discrimination. Participants made same/different judgments for pairs of stimuli consisting of a pre-recorded, self-produced vowel and a formant-shifted version of the same production. Stimuli were presented prior to a "go" signal for speaking, prior to passive listening, and during silent reading. The formant discrimination stimulus /uh/ was tested with a congruent productions list (words with /uh/) and an incongruent productions list (words without /uh/). Logistic curves were fitted to participants' responses, and the just-noticeable difference (JND) served as a measure of discrimination sensitivity. We found a statistically significant effect of condition (worst discrimination before speaking) without congruency effect. Post-hoc pairwise comparisons revealed that JND was significantly greater before speaking than during silent reading. Thus, formant discrimination sensitivity was reduced during speech planning regardless of the congruence between discrimination stimulus and predicted acoustic consequences of the planned speech movements. This finding may inform ongoing efforts to determine the functional relevance of the previously reported modulation of auditory processing during speech planning.

Subjects

Auditory Cortex , Speech Perception , Humans , Speech/physiology , Speech Perception/physiology , Acoustics , Movement , Phonetics , Speech Acoustics

8.

Infants' brain responses to social interaction predict future language growth.

Bosseler, Alexis N; Meltzoff, Andrew N; Bierer, Steven; Huber, Elizabeth; Mizrahi, Julia C; Larson, Eric; Endevelt-Shapira, Yaara; Taulu, Samu; Kuhl, Patricia K.

Curr Biol ; 34(8): 1731-1738.e3, 2024 Apr 22.

Article in English | MEDLINE | ID: mdl-38593800

ABSTRACT

In face-to-face interactions with infants, human adults exhibit a species-specific communicative signal. Adults present a distinctive "social ensemble": they use infant-directed speech (parentese), respond contingently to infants' actions and vocalizations, and react positively through mutual eye-gaze and smiling. Studies suggest that this social ensemble is essential for initial language learning. Our hypothesis is that the social ensemble attracts attentional systems to speech and that sensorimotor systems prepare infants to respond vocally, both of which advance language learning. Using infant magnetoencephalography (MEG), we measure 5-month-old infants' neural responses during live verbal face-to-face (F2F) interaction with an adult (social condition) and during a control (nonsocial condition) in which the adult turns away from the infant to speak to another person. Using a longitudinal design, we tested whether infants' brain responses to these conditions at 5 months of age predicted their language growth at five future time points. Brain areas involved in attention (right hemisphere inferior frontal, right hemisphere superior temporal, and right hemisphere inferior parietal) show significantly higher theta activity in the social versus nonsocial condition. Critical to theory, we found that infants' neural activity in response to F2F interaction in attentional and sensorimotor regions significantly predicted future language development into the third year of life, more than 2 years after the initial measurements. We develop a view of early language acquisition that underscores the centrality of the social ensemble, and we offer new insight into the neurobiological components that link infants' language learning to their early brain functioning during social interaction.

Subjects

Brain , Language Development , Magnetoencephalography , Social Interaction , Humans , Infant , Male , Female , Brain/physiology , Attention/physiology , Speech/physiology

9.

Crossmixed convolutional neural network for digital speech recognition.

Diep, Quoc Bao; Phan, Hong Yen; Truong, Thanh-Cong.

PLoS One ; 19(4): e0302394, 2024.

Article in English | MEDLINE | ID: mdl-38669233

ABSTRACT

Digital speech recognition is a challenging problem that requires the ability to learn complex signal characteristics such as frequency, pitch, intensity, timbre, and melody, which traditional methods often struggle to recognize. This article introduces three solutions based on convolutional neural networks (CNN) to solve the problem: 1D-CNN is designed to learn directly from digital data; 2DS-CNN and 2DM-CNN have a more complex architecture, converting the raw waveform into images using the Fourier transform to learn essential features. Experimental results on four large data sets, containing 30,000 samples for each, show that the three proposed models achieve superior performance compared to well-known models such as GoogLeNet and AlexNet, with the best accuracy of 95.87%, 99.65%, and 99.76%, respectively. With 5-10% higher performance than other models, the proposed solution has demonstrated the ability to effectively learn features, improve recognition accuracy and speed, and open up the potential for broad applications in virtual assistants, medical recording, and voice commands.

Subjects

Neural Networks, Computer , Speech Recognition Software , Humans , Speech/physiology , Algorithms

10.

Cluster-Based Pairwise Contrastive Loss for Noise-Robust Speech Recognition.

Lee, Geon Woo; Kim, Hong Kook.

Sensors (Basel) ; 24(8), 2024 Apr 17.

Article in English | MEDLINE | ID: mdl-38676191

ABSTRACT

This paper addresses a joint training approach applied to a pipeline comprising speech enhancement (SE) and automatic speech recognition (ASR) models, where an acoustic tokenizer is included in the pipeline to leverage the linguistic information from the ASR model to the SE model. The acoustic tokenizer takes the outputs of the ASR encoder and provides a pseudo-label through K-means clustering. To transfer the linguistic information, represented by pseudo-labels, from the acoustic tokenizer to the SE model, a cluster-based pairwise contrastive (CBPC) loss function is proposed, which is a self-supervised contrastive loss function, and combined with an information noise contrastive estimation (infoNCE) loss function. This combined loss function prevents the SE model from overfitting to outlier samples and represents the pronunciation variability in samples with the same pseudo-label. The effectiveness of the proposed CBPC loss function is evaluated on a noisy LibriSpeech dataset by measuring both the speech quality scores and the word error rate (WER). The experimental results reveal that the proposed joint training approach using the described CBPC loss function achieves a lower WER than the conventional joint training approaches. In addition, it is demonstrated that the speech quality scores of the SE model trained using the proposed training approach are higher than those of the standalone-SE model and SE models trained using conventional joint training approaches. An ablation study is also conducted to investigate the effects of different combinations of loss functions on the speech quality scores and WER. Here, it is revealed that the proposed CBPC loss function combined with infoNCE contributes to a reduced WER and an increase in most of the speech quality scores.

Subjects

Noise , Speech Recognition Software , Humans , Cluster Analysis , Algorithms , Speech/physiology
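
The abstract of record 10 combines its proposed loss with an infoNCE loss. As a rough illustration of that general idea only, here is a minimal sketch of the standard single-anchor infoNCE formulation. The function names, the use of cosine similarity, and the temperature value are assumptions made for this example. This is not the paper's CBPC loss or its training code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Standard infoNCE loss for one anchor (illustrative, hypothetical helper).

    The loss is small when the anchor is more similar to the positive
    than to any of the negatives, and large otherwise.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# The anchor matches its positive and is orthogonal to the negative.
aligned = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
# The roles are swapped: the "positive" is orthogonal to the anchor.
swapped = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
print(aligned < swapped)
```

In a pseudo-label setting such as the one the abstract describes, positives would presumably be samples sharing a cluster label and negatives samples from other clusters; that pairing rule is not shown here.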

11.

Identification of the Biomechanical Response of the Muscles That Contract the Most during Disfluencies in Stuttered Speech.

Marin, Edu; Unsihuay, Nicole; Abarca, Victoria E; Elias, Dante A.

Sensors (Basel) ; 24(8), 2024 Apr 20.

Article in English | MEDLINE | ID: mdl-38676246

ABSTRACT

Stuttering, affecting approximately 1% of the global population, is a complex speech disorder significantly impacting individuals' quality of life. Prior studies using electromyography (EMG) to examine orofacial muscle activity in stuttering have presented mixed results, highlighting the variability in neuromuscular responses during stuttering episodes. Fifty-five participants with stuttering and 30 individuals without stuttering, aged between 18 and 40, participated in the study. EMG signals from five facial and cervical muscles were recorded during speech tasks and analyzed for mean amplitude and frequency activity in the 5-15 Hz range to identify significant differences. Upon analysis of the 5-15 Hz frequency range, a higher average amplitude was observed in the zygomaticus major muscle for participants while stuttering (p < 0.05). Additionally, when assessing the overall EMG signal amplitude, a higher average amplitude was observed in samples obtained from disfluencies in participants who did not stutter, particularly in the depressor anguli oris muscle (p < 0.05). Significant differences in muscle activity were observed between the two groups, particularly in the depressor anguli oris and zygomaticus major muscles. These results suggest that the underlying neuromuscular mechanisms of stuttering might involve subtle aspects of timing and coordination in muscle activation. Therefore, these findings may contribute to the field of biosensors by providing valuable perspectives on neuromuscular mechanisms and the relevance of electromyography in stuttering research. Further research in this area has the potential to advance the development of biosensor technology for language-related applications and therapeutic interventions in stuttering.

Subjects

Electromyography , Facial Muscles , Speech , Stuttering , Humans , Electromyography/methods , Male , Adult , Female , Stuttering/physiopathology , Speech/physiology , Facial Muscles/physiology , Facial Muscles/physiopathology , Biomechanical Phenomena/physiology , Young Adult , Adolescent , Muscle Contraction/physiology

12.

SpEx: a German-language dataset of speech and executive function performance.

Camilleri, Julia A; Volkening, Julia; Heim, Stefan; Mochalski, Lisa N; Neufeld, Hannah; Schlothauer, Natalie; Kuhles, Gianna; Eickhoff, Simon B; Weis, Susanne.

Sci Rep ; 14(1): 9431, 2024 Apr 24.

Article in English | MEDLINE | ID: mdl-38658576

ABSTRACT

This work presents data from 148 German native speakers (20-55 years of age), who completed several speaking tasks, ranging from formal tests such as word production tests to more ecologically valid spontaneous tasks that were designed to mimic natural speech. This speech data is supplemented by performance measures on several standardised, computer-based executive functioning (EF) tests covering domains of working memory, cognitive flexibility, inhibition, and attention. The speech and EF data are further complemented by a rich collection of demographic data that documents education level, family status, and physical and psychological well-being. Additionally, the dataset includes information on the participants' hormone levels (cortisol, progesterone, oestradiol, and testosterone) at the time of testing. This dataset is thus a carefully curated, expansive collection of data that spans different EF domains and includes both formal speaking tests and spontaneous speaking tasks, supplemented by valuable phenotypical information. It thus provides a unique opportunity to perform a variety of analyses in the context of speech, EF, and inter-individual differences, and to our knowledge is the first of its kind in the German language. We refer to this dataset as SpEx since it combines speech and executive functioning data. Researchers interested in conducting exploratory or hypothesis-driven analyses in the field of individual differences in language and executive functioning are encouraged to request access to this resource. Applicants will then be provided with an encrypted version of the data which can be downloaded.

Subjects

Executive Function , Speech , Humans , Executive Function/physiology , Adult , Middle Aged , Female , Male , Speech/physiology , Germany , Young Adult , Language , Memory, Short-Term/physiology , Neuropsychological Tests

13.

Considerations for implanting speech brain computer interfaces based on functional magnetic resonance imaging.

Guerreiro Fernandes, F; Raemaekers, M; Freudenburg, Z; Ramsey, N.

J Neural Eng ; 21(3), 2024 May 07.

Article in English | MEDLINE | ID: mdl-38648782

ABSTRACT

Objective. Brain-computer interfaces (BCIs) have the potential to reinstate lost communication faculties. Results from speech decoding studies indicate that a usable speech BCI based on activity in the sensorimotor cortex (SMC) can be achieved using subdurally implanted electrodes. However, the optimal characteristics for a successful speech implant are largely unknown. We address this topic in a high field blood oxygenation level dependent functional magnetic resonance imaging (fMRI) study, by assessing the decodability of spoken words as a function of hemisphere, gyrus, sulcal depth, and position along the ventral/dorsal axis. Approach. Twelve subjects conducted a 7T fMRI experiment in which they pronounced 6 different pseudo-words over 6 runs. We divided the SMC by hemisphere, gyrus, sulcal depth, and position along the ventral/dorsal axis. Classification was performed in these SMC areas using a multiclass support vector machine (SVM). Main results. Significant classification was possible from the SMC, but no preference for the left or right hemisphere, nor for the precentral or postcentral gyrus for optimal word classification was detected. Classification while using information from the cortical surface was slightly better than when using information from deep in the central sulcus and was highest within the ventral 50% of SMC. Confusion matrices were highly similar across the entire SMC. An SVM-searchlight analysis revealed significant classification in the superior temporal gyrus and left planum temporale in addition to the SMC. Significance. The current results support a unilateral implant using surface electrodes, covering the ventral 50% of the SMC. The added value of depth electrodes is unclear. We did not observe evidence for variations in the qualitative nature of information across SMC. The current results need to be confirmed in paralyzed patients performing attempted speech.

Subjects

Brain-Computer Interfaces , Magnetic Resonance Imaging , Speech , Humans , Magnetic Resonance Imaging/methods , Male , Adult , Female , Speech/physiology , Young Adult , Electrodes, Implanted , Brain Mapping/methods

14.

Auditory Encoding of Natural Speech at Subcortical and Cortical Levels Is Not Indicative of Cognitive Decline.

Bolt, Elena; Giroud, Nathalie.

eNeuro ; 11(5), 2024 May.

Article in English | MEDLINE | ID: mdl-38658138

ABSTRACT

More and more patients worldwide are diagnosed with dementia, which emphasizes the urgent need for early detection markers. In this study, we built on the auditory hypersensitivity theory of a previous study, which postulated that responses to auditory input in the subcortex as well as cortex are enhanced in cognitive decline, and examined auditory encoding of natural continuous speech at both neural levels for its indicative potential for cognitive decline. We recruited study participants aged 60 years and older, who were divided into two groups based on the Montreal Cognitive Assessment, one group with low scores (n = 19, participants with signs of cognitive decline) and a control group (n = 25). Participants completed an audiometric assessment and then we recorded their electroencephalography while they listened to an audiobook and click sounds. We derived temporal response functions and evoked potentials from the data and examined response amplitudes for their potential to predict cognitive decline, controlling for hearing ability and age. Contrary to our expectations, no evidence of auditory hypersensitivity was observed in participants with signs of cognitive decline; response amplitudes were comparable in both cognitive groups. Moreover, the combination of response amplitudes showed no predictive value for cognitive decline. These results challenge the proposed hypothesis and emphasize the need for further research to identify reliable auditory markers for the early detection of cognitive decline.

Subjects

Cognitive Dysfunction , Electroencephalography , Evoked Potentials, Auditory , Humans , Female , Male , Aged , Cognitive Dysfunction/physiopathology , Cognitive Dysfunction/diagnosis , Middle Aged , Evoked Potentials, Auditory/physiology , Speech Perception/physiology , Aged, 80 and over , Cerebral Cortex/physiology , Cerebral Cortex/physiopathology , Acoustic Stimulation , Speech/physiology

15.

Machine learning decoding of single neurons in the thalamus for speech brain-machine interfaces.

Tankus, Ariel; Rosenberg, Noam; Ben-Hamo, Oz; Stern, Einat; Strauss, Ido.

J Neural Eng ; 21(3), 2024 May 09.

Article in English | MEDLINE | ID: mdl-38648783

ABSTRACT

Objective. Our goal is to decode firing patterns of single neurons in the left ventralis intermediate nucleus (Vim) of the thalamus, related to speech production, perception, and imagery. For realistic speech brain-machine interfaces (BMIs), we aim to characterize the number of thalamic neurons necessary for high accuracy decoding. Approach. We intraoperatively recorded single neuron activity in the left Vim of eight neurosurgical patients undergoing implantation of a deep brain stimulator or RF lesioning during production, perception and imagery of the five monophthongal vowel sounds. We utilized the Spade decoder, a machine learning algorithm that dynamically learns specific features of firing patterns and is based on sparse decomposition of the high dimensional feature space. Main results. Spade outperformed all the algorithms it was compared with, for all three aspects of speech: production, perception and imagery, and obtained accuracies of 100%, 96%, and 92%, respectively (chance level: 20%) based on pooling together neurons across all patients. The accuracy was logarithmic in the number of neurons for all three aspects of speech. Regardless of the number of units employed, production gained highest accuracies, whereas perception and imagery equated with each other. Significance. Our research renders single neuron activity in the left Vim a promising source of inputs to BMIs for restoration of speech faculties for locked-in patients or patients with anarthria or dysarthria to allow them to communicate again. Our characterization of how many neurons are necessary to achieve a certain decoding accuracy is of utmost importance for planning BMI implantation.

Subjects

Brain-Computer Interfaces , Machine Learning , Neurons , Speech , Thalamus , Humans , Neurons/physiology , Male , Female , Middle Aged , Speech/physiology , Adult , Thalamus/physiology , Deep Brain Stimulation/methods , Aged , Speech Perception/physiology

16.

Comparing Two Smoothing Approaches in Estimating Kinematic Parameters.

Kuberski, Stephan R; Gafos, Adamantios I.

J Speech Lang Hear Res ; 67(5): 1400-1412, 2024 May 07.

Article in English | MEDLINE | ID: mdl-38573836

ABSTRACT

PURPOSE: We compare two signal smoothing and differentiation approaches: a frequently used approach in the speech community of digital filtering with approximation of derivatives by finite differences and a spline smoothing approach widely used in other fields of human movement science. METHOD: In particular, we compare the values of a classic set of kinematic parameters estimated by the two smoothing approaches and assess, via regressions, how well these reconstructed values conform to known laws about relations between the parameters. RESULTS: Substantially smaller regression errors were observed for the spline smoothing than for the filtering approach. CONCLUSION: This result is in broad agreement with reports from other fields of movement science and underpins the superiority of splines also in the domain of speech.

Subjects

Speech , Humans , Biomechanical Phenomena , Speech/physiology , Regression Analysis , Signal Processing, Computer-Assisted
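
One of the two approaches compared in record 16 approximates derivatives by finite differences. The sketch below illustrates only that step, assuming a uniformly sampled position signal and a central-difference scheme. The function name and the sample values are hypothetical. The digital filter and the spline smoother discussed in the abstract are not reproduced.

```python
def central_difference(samples, dt):
    """Estimate velocity at interior points of a uniformly sampled signal.

    samples: positions taken every dt seconds.
    Returns one velocity per interior sample (the two endpoints are dropped).
    """
    if len(samples) < 3:
        raise ValueError("need at least three samples")
    return [
        (samples[i + 1] - samples[i - 1]) / (2.0 * dt)
        for i in range(1, len(samples) - 1)
    ]


# Hypothetical trajectory x(t) = t**2 sampled every 0.5 s.
dt = 0.5
times = [i * dt for i in range(5)]
positions = [t * t for t in times]
print(central_difference(positions, dt))
```

Finite differences amplify measurement noise, which is why this approach is normally paired with a smoothing filter before differentiation.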

17.

The Feel of Speech: Multisystem and Polymodal Somatosensation in Speech Production.

Kent, Raymond D.

J Speech Lang Hear Res ; 67(5): 1424-1460, 2024 May 07.

Article in English | MEDLINE | ID: mdl-38593006

ABSTRACT

PURPOSE: The oral structures such as the tongue and lips have remarkable somatosensory capacities, but understanding the roles of somatosensation in speech production requires a more comprehensive knowledge of somatosensation in the speech production system in its entirety, including the respiratory, laryngeal, and supralaryngeal subsystems. This review was conducted to summarize the system-wide somatosensory information available for speech production. METHOD: The search was conducted with PubMed/Medline and Google Scholar for articles published until November 2023. Numerous search terms were used in conducting the review, which covered the topics of psychophysics, basic and clinical behavioral research, neuroanatomy, and neuroscience. RESULTS AND CONCLUSIONS: The current understanding of speech somatosensation rests primarily on the two pillars of psychophysics and neuroscience. The confluence of polymodal afferent streams supports the development, maintenance, and refinement of speech production. Receptors are both canonical and noncanonical, with the latter occurring especially in the muscles innervated by the facial nerve. Somatosensory representation in the cortex is disproportionately large and provides for sensory interactions. Speech somatosensory function is robust over the lifespan, with possible declines in advanced aging. The understanding of somatosensation in speech disorders is largely disconnected from research and theory on speech production. A speech somatoscape is proposed as the generalized, system-wide sensation of speech production, with implications for speech development, speech motor control, and speech disorders.

Subjects

Speech, Humans, Speech/physiology, Lip/physiology, Tongue/physiology

18.

The Predictability of Naturalistic Evaluation of All-Day Recordings for Speech and Language Development.

Ha, Seunghee.

J Speech Lang Hear Res ; 67(5): 1370-1384, 2024 May 07.

Article in English | MEDLINE | ID: mdl-38619435

ABSTRACT

OBJECTIVES: The study aimed to investigate the predictive potential of language environment and vocal development status measures obtained through integrated analysis of Language ENvironment Analysis (LENA) recordings during the prelinguistic stage for subsequent speech and language development in Korean-acquiring children. Specifically, this study explored whether measures from both LENA-automated analysis and human coding at 6-8 months and 12-14 months of age predict vocabulary and phonological development at 18-20 months. METHOD: One-day home recordings from 20 children were collected using a LENA recorder at 6-8 months, 12-14 months, and 18-20 months. Both LENA-automated measures and measures from human coding were obtained from recordings at 6-8 months and 12-14 months. The number of different words, consonant inventory, and utterance structure inventory were identified from recordings at 18-20 months. Correlation and multiple regression analyses were performed to investigate whether measures related to early language environment and child vocalization at 6-8 months and 12-14 months were predictive of vocabulary and phonological measures at 18-20 months. RESULTS: The results showed that the two main LENA-automated measures, conversational turn count (CTC) and child vocalization count, were positively correlated with all vocabulary and phonological measures at 18-20 months. Multiple regression analysis revealed that CTC during the prelinguistic stages was the most significant predictor of the number of different words, consonant inventory, and utterance structure inventory at 18-20 months. Also, adult word count in LENA-automated measures, child-directed speech ratio, and canonical babbling ratio measured by human coding significantly predicted some vocabulary and phonological measures at 18-20 months. CONCLUSION: This study highlights the multifaceted nature of language acquisition and collectively emphasizes the value of considering both quantitative and qualitative aspects of language input to understand early language development in children.

Subjects

Child Language, Language Development, Speech, Vocabulary, Humans, Male, Female, Infant, Speech/physiology, Phonetics, Speech Production Measurement/methods

19.

Online speech synthesis using a chronically implanted brain-computer interface in an individual with ALS.

Angrick, Miguel; Luo, Shiyu; Rabbani, Qinwan; Candrea, Daniel N; Shah, Samyak; Milsap, Griffin W; Anderson, William S; Gordon, Chad R; Rosenblatt, Kathryn R; Clawson, Lora; Tippett, Donna C; Maragakis, Nicholas; Tenore, Francesco V; Fifer, Matthew S; Hermansky, Hynek; Ramsey, Nick F; Crone, Nathan E.

Sci Rep ; 14(1): 9617, 2024 Apr 26.

Article in English | MEDLINE | ID: mdl-38671062

ABSTRACT

Brain-computer interfaces (BCIs) that reconstruct and synthesize speech using brain activity recorded with intracranial electrodes may pave the way toward novel communication interfaces for people who have lost their ability to speak, or who are at high risk of losing this ability, due to neurological disorders. Here, we report online synthesis of intelligible words using a chronically implanted brain-computer interface (BCI) in a man with impaired articulation due to ALS, participating in a clinical trial (ClinicalTrials.gov, NCT03567213) exploring different strategies for BCI communication. The 3-stage approach reported here relies on recurrent neural networks to identify, decode and synthesize speech from electrocorticographic (ECoG) signals acquired across motor, premotor and somatosensory cortices. We demonstrate a reliable BCI that synthesizes commands freely chosen and spoken by the participant from a vocabulary of 6 keywords previously used for decoding commands to control a communication board. Evaluation of the intelligibility of the synthesized speech indicates that 80% of the words can be correctly recognized by human listeners. Our results show that a speech-impaired individual with ALS can use a chronically implanted BCI to reliably produce synthesized words while preserving the participant's voice profile, and provide further evidence for the stability of ECoG for speech-based BCIs.

Subjects

Amyotrophic Lateral Sclerosis, Brain-Computer Interfaces, Speech, Humans, Amyotrophic Lateral Sclerosis/physiopathology, Amyotrophic Lateral Sclerosis/therapy, Male, Speech/physiology, Middle Aged, Implanted Electrodes, Electrocorticography

20.

Exploring inter-trial coherence for inner speech classification in EEG-based brain-computer interface.

Lopez-Bernal, Diego; Balderas, David; Ponce, Pedro; Molina, Arturo.

J Neural Eng ; 21(2), 2024 Apr 26.

Article in English | MEDLINE | ID: mdl-38626760

ABSTRACT

Objective. In recent years, electroencephalogram (EEG)-based brain-computer interfaces (BCIs) applied to inner speech classification have gathered attention for their potential to provide a communication channel for individuals with speech disabilities. However, existing methodologies for this task fall short in achieving acceptable accuracy for real-life implementation. This paper concentrated on exploring the possibility of using inter-trial coherence (ITC) as a feature extraction technique to enhance inner speech classification accuracy in EEG-based BCIs. Approach. To address the objective, this work presents a novel methodology that employs ITC for feature extraction within a complex Morlet time-frequency representation. The study involves a dataset comprising EEG recordings of four different words for ten subjects, with three recording sessions per subject. The extracted features are then classified using k-nearest-neighbors (kNN) and support vector machine (SVM). Main results. The average classification accuracy achieved using the proposed methodology is 56.08% for kNN and 59.55% for SVM. These results demonstrate comparable or superior performance in comparison to previous works. The exploration of inter-trial phase coherence as a feature extraction technique proves promising for enhancing accuracy in inner speech classification within EEG-based BCIs. Significance. This study contributes to the advancement of EEG-based BCIs for inner speech classification by introducing a feature extraction methodology using ITC. The obtained results, on par or superior to previous works, highlight the potential significance of this approach in improving the accuracy of BCI systems. The exploration of this technique lays the groundwork for further research toward inner speech decoding.

Subjects

Brain-Computer Interfaces, Electroencephalography, Speech, Humans, Electroencephalography/methods, Electroencephalography/classification, Male, Speech/physiology, Female, Adult, Support Vector Machine, Young Adult, Reproducibility of Results, Algorithms
